 convergence behavior


Estimating Mixture Distributions via Stochastic Mirror Descent

arXiv.org Machine Learning

We revisit the classical problem of estimating an unknown distribution from its samples by fitting a mixture model that minimizes cross-entropy loss. Framing the task as a stochastic convex optimization problem over the space of $M$-component mixture distributions, we propose a family of estimators derived from the stochastic mirror descent (SMD) algorithm. This optimization-based approach provides a principled and flexible framework that generalizes traditional estimators and yields a variety of novel estimators through the choice of Bregman divergence. A key advantage of our method is that it scales efficiently with the number of candidate components $f_i$; that is, one can employ a large set of basis distributions in the mixture model without incurring significant computational overhead. This enables richer approximations and improved estimation accuracy. Moreover, in the case of categorical distributions (discrete outcomes), our estimators do not require a strict lower bound; in other words, our framework does not require precise knowledge of the support of the distribution. We demonstrate that, under mild conditions, the proposed $\phi$-SMD estimators achieve near-optimal convergence rates in both Kullback-Leibler (KL) divergence and $\ell_2$-norm and offer practical benefits when computation is expensive. Our numerical analysis highlights improved performance guarantees over classical estimators, particularly in terms of sample efficiency and scalability.
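To make the setup in this abstract concrete, here is a minimal sketch of one well-known instance of stochastic mirror descent: the entropic mirror map (exponentiated gradient) applied to the cross-entropy loss over mixture weights. It is an illustration under stated assumptions, not the authors' estimator. The fixed Gaussian components, the $1/\sqrt{t}$ step-size schedule, the iterate averaging, and the names `gaussian_pdf` and `smd_mixture_weights` are all hypothetical choices made for this example.

```python
import numpy as np


def gaussian_pdf(x, mean, std=1.0):
    """Density of N(mean, std^2) at x; `mean` may be an array of component means."""
    z = (x - mean) / std
    return np.exp(-0.5 * z * z) / (std * np.sqrt(2.0 * np.pi))


def smd_mixture_weights(samples, means, step=0.5):
    """Estimate mixture weights over fixed Gaussian components (assumed known).

    One pass of stochastic mirror descent with the negative-entropy mirror map,
    i.e. a multiplicative (exponentiated-gradient) update on the simplex.
    Returns the running average of the iterates.
    """
    m = len(means)
    log_w = np.full(m, -np.log(m))       # start from uniform weights
    avg_w = np.zeros(m)
    for t, x in enumerate(samples, start=1):
        w = np.exp(log_w)
        f = gaussian_pdf(x, means)       # f_i(x) for every candidate component
        mix = max(np.dot(w, f), 1e-300)  # guard against underflow to zero
        grad = -f / mix                  # gradient of -log(sum_i w_i f_i(x))
        log_w = log_w - (step / np.sqrt(t)) * grad
        log_w -= np.logaddexp.reduce(log_w)   # renormalize in the log domain
        avg_w += (np.exp(log_w) - avg_w) / t  # running mean of the iterates
    return avg_w


# Synthetic check: the middle component has true weight zero.
rng = np.random.default_rng(0)
means = np.array([-3.0, 0.0, 3.0])
true_w = np.array([0.7, 0.0, 0.3])
labels = rng.choice(3, size=2000, p=true_w)
samples = rng.normal(means[labels], 1.0)
w_hat = smd_mixture_weights(samples, means)
print(w_hat)
```

Each update costs one evaluation of every component density, so the per-sample cost grows linearly with the number of components. Because the update is multiplicative, the weights stay on the simplex without an explicit projection step.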


Mean Field for the Stochastic Blockmodel: Optimization Landscape and Convergence Issues

Neural Information Processing Systems

Variational approximation has recently been widely used in large-scale Bayesian inference; its simplest form imposes a mean field assumption to approximate complicated latent structures. Despite the computational scalability of mean field, theoretical studies of its loss function surface and of the convergence behavior of iterative updates for optimizing the loss are far from complete. In this paper, we focus on the problem of community detection for a simple two-class Stochastic Blockmodel (SBM). Using batch coordinate ascent variational inference (BCAVI) for updates, we give a complete characterization of all the critical points and show different convergence behaviors with respect to initializations. When the parameters are known, we show that a significant proportion of random initializations will converge to the ground truth. On the other hand, when the parameters themselves need to be estimated, a random initialization will converge to an uninformative local optimum.
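As an illustration of the known-parameter setting, the sketch below implements a batch mean-field update for a two-class SBM. The update is derived here from the standard mean-field objective, assuming equal class priors, within-class edge probability `p`, and between-class edge probability `q`. It is not copied from the paper, and the function name, the synthetic graph, and the informative initialization are hypothetical choices for this example.

```python
import numpy as np


def bcavi_known_params(A, p, q, psi0, n_iter=50):
    """Batch mean-field updates for a two-class SBM with known p > q.

    psi[i] approximates the posterior probability that node i is in class 1.
    All coordinates are updated simultaneously from the previous iterate.
    """
    n = A.shape[0]
    t = 0.5 * np.log(p * (1 - q) / (q * (1 - p)))
    lam = np.log((1 - q) / (1 - p)) / (2 * t)
    # Centered adjacency; the diagonal is excluded (no self-loops assumed).
    M = A - lam * (np.ones((n, n)) - np.eye(n))
    psi = psi0.copy()
    for _ in range(n_iter):
        xi = 4 * t * (M @ (psi - 0.5))        # log-odds of class 1 for each node
        psi = 0.5 * (1.0 + np.tanh(xi / 2))   # numerically stable sigmoid
    return psi


# Synthetic two-block graph with equal-sized communities.
rng = np.random.default_rng(1)
n, p, q = 100, 0.5, 0.1
truth = np.repeat([1, 0], n // 2)
same = truth[:, None] == truth[None, :]
upper = np.triu(rng.random((n, n)) < np.where(same, p, q), k=1)
A = (upper | upper.T).astype(float)

# Weakly informative start, chosen so that this example is deterministic.
psi0 = 0.5 + 0.1 * (truth - 0.5)
psi = bcavi_known_params(A, p, q, psi0)
accuracy = np.mean((psi > 0.5) == (truth == 1))
print(accuracy)
```

Replacing `psi0` with a random vector, for example `rng.random(n)`, is a simple way to probe how the limit depends on the initialization, which is the question the abstract raises.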




Benchmarking VQE Configurations: Architectures, Initializations, and Optimizers for Silicon Ground State Energy

arXiv.org Artificial Intelligence

Quantum computing presents a promising path toward precise quantum chemical simulations, particularly for systems that challenge classical methods. This work investigates the performance of the Variational Quantum Eigensolver (VQE) in estimating the ground-state energy of the silicon atom, a relatively heavy element that poses significant computational complexity. Within a hybrid quantum-classical optimization framework, we implement VQE using a range of ansätze, including Double Excitation Gates, ParticleConservingU2, UCCSD, and k-UpCCGSD, combined with various optimizers such as gradient descent, SPSA, and ADAM. The main contribution of this work lies in a systematic methodological exploration of how these configuration choices interact to influence VQE performance, establishing a structured benchmark for selecting optimal settings in quantum chemical simulations. Key findings show that parameter initialization plays a decisive role in the algorithm's stability, and that the combination of a chemically inspired ansatz with adaptive optimization yields superior convergence and precision compared to conventional approaches.



Appendices for "Finding Second-Order Stationary Points Efficiently in Smooth Nonconvex Linearly Constrained Optimization Problems": A. Details of Implementation of Algorithms

Neural Information Processing Systems

In this section, we elaborate on the ideas behind the design of SNAP. First, we give the main motivation for selecting the update directions. Next, we give a detailed description of the line search algorithm used in SNAP.

A.2 Line Search Algorithm

To understand the algorithm, let us first define the set of inactive constraints as A Lemma 2. If there exists an index i A (x Therefore, the line search algorithm reduces to the classic unconstrained update. If so, then the algorithm either touches the boundary without increasing the objective, or it has already achieved sufficient descent.